From HTML to Usable Data: Problems in Meaning and Credibility in the WWW (Position Paper)

Authors

  • Joshua Grass
  • Shlomo Zilberstein
Abstract

Computer Science Department, University of Massachusetts, Amherst, MA 01003, U.S.A.

Introduction

This paper describes issues in information integration that relate to Value-Driven Information Gathering (VDIG) (Grass & Zilberstein 1996; 1997; 1998). Value-driven information gathering is the process of querying multiple information sources for information items that are used to make a decision. VDIG works in a resource-bounded environment where it is not possible to gather all the information needed to make a perfect decision. Instead, VDIG keeps statistics on the response expectation of particular sites, and the decision model can operate with partial information.

The process is referred to as value-driven because the algorithm determines the value of a query for each potential site and queries the best candidate. The value of a query is determined using the value of information from the decision model, the expectation of a site returning a result at any time in the future, the information the system already knows, and the cost function, which represents the resources the system is allowed to spend in order to make the decision.

In this paper we focus on one aspect of the value-driven process: taking raw information from sites and converting it into a form usable by our decision model (see footnote 1). The decision model we use is an influence diagram, which uses information passed to it from an extraction engine to instantiate nodes. At present we rely on hand-coded extraction algorithms that convert web sites into a list of feature/value tuples. For our prototype system this approach works well, and we have obtained good results from testing the system on the task of deciding which digital camera to purchase.
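The value-driven selection step can be illustrated with a minimal sketch. It assumes a deliberately simplified value formula (response probability times information value, minus query cost); the paper does not give this formula, and the names `Site`, `query_value`, and `best_query`, as well as the numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    p_response: float  # assumed probability the site answers within the time budget
    info_value: float  # assumed value of the answer to the decision model
    query_cost: float  # assumed cost of the resources spent on the query


def query_value(site: Site) -> float:
    """Expected net value of querying a site (illustrative formula only)."""
    return site.p_response * site.info_value - site.query_cost


def best_query(sites):
    """Return the site with the highest positive expected value, or None."""
    candidates = [s for s in sites if query_value(s) > 0]
    return max(candidates, key=query_value, default=None)


sites = [
    Site("review-site", p_response=0.9, info_value=10.0, query_cost=2.0),
    Site("vendor-site", p_response=0.5, info_value=30.0, query_cost=4.0),
    Site("slow-archive", p_response=0.1, info_value=20.0, query_cost=5.0),
]

choice = best_query(sites)
print(choice.name if choice else "stop gathering")
```

Returning `None` when no query has positive expected value models the resource-bounded setting: the system would then decide with the partial information it already has.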
Numerous other groups are developing much more open-ended extraction engines (Doorenbos, Etzioni, & Weld 1997; Ashish & Knoblock 1997a; 1997b; Konopnicki & Shmueli 1995; Genesereth, Keller, & Mueller 1996). Figure 1 shows the influence diagram used by VDIG to evaluate digital cameras.

Support for this work was provided in part by the National Science Foundation under grants IRI-9624992, IRI-9634938, and INT-9612092.

Footnote 1: It should be noted that although we are dealing with unifying information from distinct information sources in the context of value-driven information gathering, these techniques are valid in domains with few information sources.
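As a rough illustration of what a hand-coded extractor producing feature/value tuples might look like, the sketch below parses a two-column HTML table. The page fragment, the regular expression, and the function name `extract_features` are assumptions made for this example; they are not taken from the authors' system.

```python
import re

# Hypothetical page fragment; real camera pages would differ from site to site.
PAGE = """
<table>
  <tr><td>Resolution</td><td>640 x 480</td></tr>
  <tr><td>Price</td><td>$499</td></tr>
</table>
"""

# One hand-written pattern per site layout: two cells per table row.
ROW = re.compile(r"<tr><td>(.*?)</td><td>(.*?)</td></tr>")


def extract_features(html):
    """Return (feature, value) tuples from two-column table rows."""
    return [(name.strip().lower(), value.strip())
            for name, value in ROW.findall(html)]


print(extract_features(PAGE))
```

A pattern like this is tied to a single page layout, which is the main limitation of hand coding: each new site needs its own extractor.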


Similar resources

The Position of the human rights components in the contents of Iran elementary education Textbooks

The present study addresses one of the most important contemporary issues in education and curriculum development, namely "human rights education". Using content analysis, 36 textbooks from the 2012-2013 school year, totaling 3924 pages, were studied and analyzed. For the analysis of the data, Shannon's entropy method derived from the theory of systems was used to obtain the credibility rat...


Identifying Credibility Criteria in Scholarly Communication (Reading and Citing) form the Standpoints of Faculty Members of Kharazmi University

Background and Aim: In effect, every scientific endeavor consists of scientific communication and scientists' involvement in a particular field of study, and scientific board members, as the most prominent participants, play a key role in scientific production. Therefore, constructive scientific communication requires obtaining credible and valid information. To that end, this study tries to inve...


Estimation of LOS Rates for Target Tracking Problems using EKF and UKF Algorithms- a Comparative Study

One of the most important problems in target tracking is line-of-sight (LOS) rate estimation for use with the proportional navigation (PN) guidance law. This paper deals with estimating the position and LOS rates of a target with respect to the pursuer from available noisy RF seeker and tracker measurements. Due to many important for exact estimation on tracking problems must target position and Line O...


A Framework For Extracting Information From Web Using VTD-XML‘s XPath

The exponential growth of the WWW (World Wide Web) has produced a vast pool of information as well as several challenges, such as extracting potentially useful and previously unknown information from it. Many websites are built with HTML; because of its unstructured layout, it is difficult to obtain effective and precise data from the web using HTML. The advent of XML (Extensible Markup Language) p...


Publication date: 2007